Abstract

We develop an abstract model of atomic clocks that describes the 
full dynamics of repeated synchronization of a classical oscillator with 
a quantum reference. We then focus on the stationary state of the 
model, in particular its dependence on the control scheme, the in- 
terrogation time and the stability of the oscillator. Control schemes 
are classified by autocorrelations of the frequency before and after the 
synchronization. This provides a one-parameter family of stationary 
states that differ in variance of the frequency, yet produce an equally 
good clock time. Formally, this is a consequence of a novel Cramer- 
Rao type inequality. We also derive an optimal interrogation time and 
show that it is determined by the balance between the dissipation of 
the oscillator and the information gained from the synchronization. 

1 Introduction 

After more than 60 years of development (see e.g. PU]), atomic clocks are 
not only standard experimental equipment, but they are also commercially 
employed (for example in GPS). In view of that, a study of a basic model of 
atomic clocks might seem redundant. Yet, to the best of our knowledge, a 
comprehensive mathematical description of an atomic clock operation, and 
in particular an analysis of the stationary state of the clock, does not exist. 
Indeed, the main focus of the theory of atomic clocks is on a single synchronization
of the clock oscillator with the quantum reference: improvement of
the spectroscopy precision using entanglement [2], synchronization of clocks
[12] and the role of decoherence [17]. On the other hand, we aim to study the
stationary operation of an atomic clock as achieved after a large number of 
repeated synchronizations. In particular, we describe how the frequency of 
the oscillator during the stationary operation is determined by the balance 
between the synchronization and the dissipation of the oscillator. 

We briefly describe how an atomic clock operates. A clock is any device
that uses the periodic motion of an oscillator to measure time. It counts 
beats of the oscillator and displays them in convenient units. Provided that 
it counts correctly, accuracy of the clock is determined by properties of its 
oscillator, the most important being its mean frequency and its stability. To 
increase the stability one often employs a frequency reference through the 
following control scheme. 

The frequency reference is a physical process with a stable characteristic 
frequency that can be observed repeatedly. The oscillator is tuned to this 
frequency and then continuously (or in discrete time intervals) synchronized 
with the reference. This stabilizes the oscillator given that the reference 
frequency is more stable than the oscillator itself. There is always some 
obstruction in observing the frequency reference that prevents perfect syn- 
chronization. We proceed by giving two examples. 

A somewhat fairy-tale example is a pendulum clock with adjustable
length that is tuned to the frequency of a chosen pulsar. The pulsar is the
frequency reference, and any time we observe the pulsar in the sky we adjust
the length of the pendulum to match the frequency of the pulsar.

The next example is less artificial and more relevant for our study. We
describe a Cesium atomic clock: a voltage-controlled quartz crystal oscillator
tuned to the frequency of the transition between two hyperfine split levels of the
Cesium atom, a ground state $|g\rangle$ and an excited state $|e\rangle$ separated by an
energy gap $\omega_0$. The free Hamiltonian of the atom is $H_0 = \omega_0\sigma_z/2$.

A quartz crystal with frequency $\omega(t)$ operates an electromagnetic field
in a U-shaped cavity (see Fig. 1) so that a Cesium atom passing through the
cavity experiences an interaction Hamiltonian

$$H_I(t) = -\mu B(t)\cdot\sigma, \qquad B(t) = B(\sin\omega(t), \cos\omega(t), 0).$$

The strength of the magnetic field $B$ can be adjusted so that a Cesium atom
entering the left part of the cavity in the ground state would exit the cavity
in the Bloch superposition, e.g. $(|g\rangle + |e\rangle)/\sqrt{2}$. During the flight between
the cavity ends (of duration $\tau$) the state acquires a relative phase

$$\int_0^\tau (\omega_0 - \omega(s))\,ds.$$

Upon exiting the right side of the cavity the probability of finding the atom
in the excited state is $P(e) = \cos^2\left(\int_0^\tau (\omega_0 - \omega(s))\,ds/2\right)$. Using a beam of Cesium
atoms one can then adjust a voltage controlling the quartz crystal to achieve
$P(e) = 1$.

Figure 1: A control scheme of the Cesium atomic clock. A quartz crystal
operates an electromagnetic field inside a cavity. A beam of Cesium atoms 
passes through the cavity into a detector and provides information on the 
frequency difference between the quartz crystal frequency and the Cesium 
atom reference frequency. This information is used in a loop to control the 
quartz crystal in order to make the difference zero. 

The above synchronization procedure employs Ramsey interferometry
(see Fig. 2) and represents a set-up that includes the preparation of a state,
its evolution for time $\tau$ and a measurement, the evolution being described
in a rotating frame by a Hamiltonian $H(t) = (\omega_0 - \omega(t))H$. Regardless of
the details of the experimental setup, it is clear that information about $\omega_0 - \omega(t)$
can be acquired given knowledge of $H$ and $\tau$. This is a standard starting
point for an abstract model of an atomic clock.

We call any clock with a quantum mechanical frequency reference an
atomic clock. It is a device that synchronizes a classical oscillator with a
quantum frequency reference, and its clock time is given by counting the
oscillations of the classical oscillator. The quantum mechanical nature of the reference
makes perfect synchronization impossible due to the fundamental uncertainty
in observing the reference. Furthermore, the observation disturbs the state
of the frequency reference and as a consequence the reference needs to be
reset between uses. The resulting entropy production has to be
viewed as an additional cost of the synchronization.

Figure 2: Bloch representation of the state of a Cesium atom during the
Ramsey interferometry. Top left: Before entering the cavity the atom is in
the ground state. Top right: In between the cavity ends the state is on the
equator and undergoes Bloch oscillations with frequency $\omega_0$. Bottom: The
state before detection has an angle with the excited state proportional to the
acquired relative phase $\int (\omega_0 - \omega(s))\,ds$.

If we had a perfect classical oscillator there would be no need for continu- 
ous synchronization with a reference. It is therefore clear that the stability of 
the clock's oscillator should play a major role in determining the stationary 
operation of the atomic clock: a stable oscillator requires less synchronization
than an unstable oscillator in order to achieve the same accuracy of
the clock. By a standard diffusion-dissipation relation one guesses that an 
optimal operation should be characterized by a balance between the dissipa- 
tion of the oscillator and the information gained by the synchronization. To 
rigorously establish such a relation is one of the main goals of this article. 

There are many types of atomic clocks differing both in the classical 
oscillator and the quantum frequency reference. We do not aim to discuss 
any particular type and motivated by the example of the Cesium clock we 
establish the following abstract model. 

We decompose the frequency of the oscillator, $\omega(t)$, into the reference
frequency, $\omega_0$, and an instantaneous error $\varphi(t)$,

$$\omega(t) = \omega_0 + \varphi(t).$$

In the absence of synchronization the frequency of the oscillator (and hence the
error) is described by a classical stochastic process. The quantum frequency
reference is modeled by a $\varphi(t)$-dependent quantum state $\rho$. At discrete times,
a measurement of the quantum state provides an estimation of $\varphi(t)$ and an
adjustment of $\omega(t)$ towards the intended frequency $\omega_0$ is made according to
the outcome.

In more detail, the synchronization process consists of three steps (we
describe it at $t = 0$):

1. The reference system is initialized to a state $\rho_0$ and evolves freely for
an interrogation time $\tau$,

$$\rho_0 \to \rho(\tau\bar\varphi) = e^{-i\tau\bar\varphi H}\rho_0 e^{i\tau\bar\varphi H}, \qquad \left(\bar\varphi := \frac{1}{\tau}\int_0^\tau \varphi(s)\,ds\right),$$

where $H$ is a known Hamiltonian.

2. A subsequent measurement provides an estimation, $\hat{\bar\varphi}$, of the time-averaged
frequency error, $\bar\varphi$. The frequency is then adjusted according to
the outcome,

$$\varphi(\tau) \to \varphi(\tau) - \hat{\bar\varphi}. \tag{1}$$

3. The procedure is repeated with a period $T \geq \tau$.
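The three steps above can be sketched as a small classical simulation. This is only an illustration under stated assumptions, not the model analysed in this article: it takes the period equal to the interrogation time, lets the frequency error perform a Gaussian random walk whose increment over a time `dt` has variance `2*D*dt`, and replaces the quantum measurement by a Gaussian readout of the time-averaged error with a hypothetical variance `noise_var`. The function name `simulate_clock` and all of its parameters are illustrative.

```python
import random


def simulate_clock(periods, T, D, noise_var, zeta=0.0, phi0=0.0,
                   substeps=20, seed=0):
    """Return the frequency error phi right after each synchronization.

    Assumptions of this sketch (not taken from the text): the period T
    equals the interrogation time, the free evolution is a Gaussian random
    walk whose increment over dt has variance 2*D*dt, and the measurement is
    replaced by a Gaussian readout of the time-averaged error.
    """
    rng = random.Random(seed)
    dt = T / substeps
    phi = phi0
    after_sync = []
    for _ in range(periods):
        # Step 1: free evolution; accumulate the time average of phi.
        integral = 0.0
        for _ in range(substeps):
            phi += rng.gauss(0.0, (2.0 * D * dt) ** 0.5)
            integral += phi * dt
        phi_bar = integral / T
        # Step 2: noisy estimate scaled by (1 - zeta), then the adjustment (1).
        estimate = (1.0 - zeta) * (phi_bar + rng.gauss(0.0, noise_var ** 0.5))
        phi -= estimate
        after_sync.append(phi)
        # Step 3: the loop repeats the procedure with period T.
    return after_sync
```

In the noiseless limit of this sketch (`D = 0`, `noise_var = 0`) the error simply contracts by a factor `zeta` per period, so `simulate_clock(3, 1.0, 0.0, 0.0, zeta=0.5, phi0=1.0)` produces a geometric sequence.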

A clock model (see Section 5) is defined by: a stochastic evolution in the absence
of synchronization, a family of states $\rho(\varphi)$, an estimation strategy and
two timescales $\tau$, $T$. The estimation strategy consists of a POVM measurement
$\Pi(\alpha)$ and an estimator $\Phi$. The estimator describes the post-processing
of a measurement outcome: upon an outcome $\alpha$ a guess $\Phi(\alpha)$ of the unknown
$\varphi$ is made. We call the stochastic process $\varphi(t)$ the state of the clock. Several
remarks are in order.

Remark 1 The Hamiltonian evolution in the first step will be replaced by a
general $\bar\varphi$-dependent family of states, $\rho(\tau\bar\varphi)$. This in particular incorporates
an open system evolution (e.g. dephasing) of the quantum reference.

Note that $\rho$ depends on the time-averaged frequency $\bar\varphi$ rather than the
instantaneous frequency $\varphi(\tau)$. This discrepancy gives an additional error to
the synchronization. For a continuous process $\varphi(t)$ the error should disappear
in the limit $\tau \to 0$.

It is conceivable (and also realistic) that the control scheme at first collects
data from several consecutive measurements before adjusting the frequency. 
We do not discuss such a generalization here. 

The clock time within our model is given by

$$t_{clock} = \frac{1}{\omega_0}\int_0^t \omega(s)\,ds = t + \frac{1}{\omega_0}\int_0^t \varphi(s)\,ds.$$

It follows that the accuracy of the clock is determined by the relative frequency
error $\varphi(t)/\omega_0$. In particular, a high-frequency reference is superior to a
low-frequency reference provided they are equal in all other aspects.

Our goal is to describe an atomic clock in a well defined mathematical 
manner and to pose several interesting problems related to its operation. 
Among the problems are: 

1. Existence of a stationary state of the clock. 

2. Properties of the stationary state, especially the variance of the asso- 
ciated clock time. 

3. Entropy production of the clock. 

In this article we first develop the model in its full generality and then
focus on the second question. We derive Cramer-Rao type bounds for the
variance of the stationary state $\varphi(t)$ and for the variance of the associated
clock time. We show that the former depends considerably on the estimation
strategy, while the latter is to a large extent universal. We now describe our
results in detail.

We focus on a special family of Gaussian unbiased clocks (see Section 5).
This is a natural family of clocks that are guaranteed to have an unbiased
clock time, $E[t_{clock}] = t$. In particular, an unbiased clock has the property
that $E[\varphi(s)] = 0$ implies $E[\varphi(t)] = 0$ for all $t > s$. This holds true if and only
if the estimation strategy is a multiple (denoted by $1 - \zeta$) of an unbiased
estimation strategy and the process $\varphi(t)$ in the absence of synchronization
is a martingale. Gaussian clocks are a special example of the latter: the
stochastic process in the absence of synchronization is a 1D random walk with a
diffusion constant $D$. Whenever we want to stress a particular value of $\zeta$ we
speak about $\zeta$-unbiased clocks.

We study a stationary state $\varphi(t)$ of a Gaussian $\zeta$-unbiased atomic clock,
i.e. a state for which $\varphi(t + nT)$, $n \in \mathbb{N}$, is a stationary process. The parameter
$\zeta$ affects the autocorrelation of $\varphi(t)$ through the adjustment (1) and is
equivalently described by an exponentially decaying autocorrelation of the
stationary state,

$$E[\varphi(t + nT)\varphi(t)] = \zeta^n E[\varphi(t)^2].$$

The natural case $\zeta = 0$ (unbiased estimation) corresponds to a solution with
no correlations over periods longer than $T$; however, we show that the whole interval
$\zeta \in (-1, 1)$ needs to be taken into account when optimizing the performance
of the clock. This will be apparent from the variance of the clock time
and the variance of the stationary state $\varphi(t)$. We describe these for the case
$T = \tau$.

Let $\sigma^2$ be the variance of the stationary state $\varphi(t)$ immediately after the
synchronization. Then we derive a bound

$$\sigma^2 \geq \frac{1-\zeta}{1+\zeta}\,\frac{1}{FT^2} + DT\,\frac{g(\zeta)}{(1-\zeta)(\zeta+1)},$$

where $F$ is the Fisher information of the family $\rho(\varphi)$ and $D$ is the diffusion
constant of the oscillator in the absence of synchronization. Here $\zeta \in (-1, 1)$, and
$g(\zeta)$ is bounded in this region and non-vanishing at $\zeta = \pm 1$. Minimizing the
variance (if desirable) singles out an optimal $\zeta$. Note that $\zeta = 1$ eliminates
the first term at the expense of making the second term infinite. We show
that for an optimal interrogation time (see below) the optimal parameter $\zeta$
is given by $\zeta \approx 0.35$.

The clock time associated to such a solution satisfies a bound



where $f(\zeta)$ is bounded and positive in the closed interval $[-1, 1]$. Note that
for the clock time error the first term is $\zeta$-independent. For a fixed $F$ and
$\zeta$, we find that an optimal interrogation time satisfies

1 

DT. 



This justifies the intuition that the dissipation and the information obtained 
from the synchronization should be proportional. 

The article is organized as follows. In the preliminary Section 2 we recall
the basic theory of stochastic processes. In Sections 3 and 4 we describe the
classical and the quantum estimation theories. In particular we derive a novel
version of the Cramer-Rao bound that emphasizes the role of correlations
between an unknown and its estimation. Our model of an atomic clock is
fully described in Section 5, where we also derive the aforementioned bounds.
In Section 6 we give an example where all bounds are saturated and in
Section 7 we discuss the optimization of the clock's performance. We close
our exposition with an outlook in Section 8.

A comprehensive reference for atomic clocks is a book by F. Riehle [2T] : 
in particular the operation of the Cesium atomic clock that we sketched in 
the introduction is explained there in all details. Foundations of quantum 
estimation theory are described in a monograph by C. W. Helstrom [15] or
in one by A. S. Holevo [16]. There are many recent results in quantum
estimation theory related to (or directly applied to) atomic clocks; giving
a full account of them with proper credit is beyond the scope of our work. We list
the works that most influenced the author during the writing.

The accuracy of synchronization between a classical oscillator and a quantum
reference was first derived by W. M. Itano et al. [18]; the improvement
due to entanglement and the role of dephasing have a very clear exposition in
a paper by S. F. Huelga et al. [17]. A concise general view of quantum
metrology can be found in V. Giovannetti et al. [13]. Furthermore, a repeated
interaction model (e.g. [1]) has a similar setup, since our model can
also be viewed as a repeated interaction of a chain of quantum systems with
a classical system.

The focus of our work is on an exposition of the model and a study of
the aforementioned fundamental properties. In particular we do not aim to
prove our statements under minimal conditions.

Assumption 2 We assume that all functions appearing in the text are con- 
tinuously differentiable in an appropriate space and that all probability dis- 
tributions have a finite second moment. 



2 Stochastic processes 

We will consider a probability distribution $p(\theta)$ of a single real parameter
$\theta$ or a joint probability distribution $p(\theta, \theta')$ of two real parameters $\theta$, $\theta'$. The
former is a reduced probability distribution of the latter, $p(\theta) = \int p(\theta, \theta')\,d\theta'$.
Furthermore, associated to the latter there is a conditional probability distribution
of a single parameter,

$$p(\theta'|\theta) = \frac{p(\theta, \theta')}{\int p(\theta, \theta')\,d\theta'},$$

describing the probability of $\theta'$ given $\theta$.

For a probability distribution $p(\theta)$ we denote by $\mu(p)$, $\sigma^2(p)$ its mean and
variance respectively,

$$\mu(p) := \int \theta\,p(\theta)\,d\theta, \qquad \sigma^2(p) := \int (\theta - \mu(p))^2\,p(\theta)\,d\theta.$$

The mean of a joint probability distribution $p(\theta, \theta')$ is the vector of means
and its variance is the matrix of mutual covariances.

Conversely (with a slight abuse of notation), we will often consider pairs
of random real-valued variables $\theta$, $\theta'$ on a probability space¹ $\{\Omega, d\mu\}$. This
induces a joint probability distribution $p(\theta, \theta')$ that reproduces expectations,

$$E[f(\theta, \theta')] = \int f(\theta, \theta')\,p(\theta, \theta')\,d\theta\,d\theta'.$$

The random variable $\theta$ by itself has a probability distribution $p(\theta)$. If random
variables are specified only by prescribing their joint probability distribution,
then their usage is independent of a realization (as a function on a
certain probability space).

¹ To simplify the notation we never spell out the sigma-algebra explicitly. Those who care
should always be able to fill it in from the context.

Crucial for estimation theory (and our work) is the notion of conditional
expectation. The conditional expectation of $\theta$ given $\theta'$ is a real-valued random
variable $E[\theta|\theta']$ on the probability space $\{\Omega, d\mu\}$ given by

$$E[\theta|\theta'] = \int \theta\,p(\theta|\theta')\,d\theta.$$

The conditional expectation is the unique random variable measurable with respect
to the sigma-algebra generated by $\theta'$ (i.e. such that it is constant on the sets
where $\theta'$ is constant) that reproduces expectations,

$$E[f(\theta')E[\theta|\theta']] = E[f(\theta')\theta]. \tag{2}$$

The most useful instance of this formula is $f(x) = 1$: the conditional expectation
$E[\theta|\theta']$ has the same expectation as $\theta$,

$$E[\theta] = E[E[\theta|\theta']].$$

The space of real-valued random variables has a natural associated scalar
product $(\theta, \theta') := E[\theta\theta']$. Random variables of finite variance equipped with
this scalar product form a real Hilbert space. We refer to this scalar product
whenever we speak about orthogonality of two random variables.

A stochastic process is a collection of random variables; we will use both
stochastic processes, $X_t$, in continuous time $t \geq 0$ and discrete processes
$X_n$, $n \in \mathbb{N}$. The first naturally describes the dependence of frequency on time, the
second is a suitable description of measurements occurring in discrete time
steps. We will also encounter integrated processes,

$$\int_0^t X_s\,ds, \qquad \sum_{j=0}^{n} X_j.$$

These processes naturally occur as a relation between clock time and instantaneous
frequency.

Below we consider only discrete processes in detail. The corresponding
concepts for processes in real time should be clear.

Of main interest will be the mean and the variance of the instantaneous frequency
and the variance of the associated clock time. More generally we
will frequently use quadratic quantities associated to the process $X_n$, in
particular its mean $E[X_n]$ and autocovariance

$$C(X_{n+h}, X_n) = E[(X_{n+h} - E[X_{n+h}])(X_n - E[X_n])].$$

For $h = 0$ the autocovariance reduces to the variance of the process at time $n$.
Quadratic quantities of an integrated process can be computed in terms
of the integrated covariance. For completeness we give an explicit formula,

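$$C\left(\sum_{j=0}^{n+h} X_j, \sum_{j=0}^{n} X_j\right) = \sum_{j=0}^{n} C(X_j, X_j) + 2\sum_{j=0}^{n}\sum_{k=1}^{n-j} C(X_{j+k}, X_j) + \sum_{j=0}^{n}\sum_{k=n-j+1}^{n+h-j} C(X_{j+k}, X_j).$$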

A process $X_n$ is called stationary if the joint distribution of $X_{n_1+h}, \dots, X_{n_j+h}$
is independent of $h$. In particular its mean, variance and autocovariance are
independent of $n$; we denote $\gamma(h) := C(X_{n+h}, X_n)$ and $\gamma(0) = \sigma^2$. The
ratio² $\zeta(h) := \gamma(h)/\sigma^2$ is known as the correlation function. The formulas for
an integrated stationary process simplify by one summation, e.g.:

$$C\left(\sum_{i=0}^{n} X_i, \sum_{i=0}^{n} X_i\right) = (n + 1)\sigma^2 + 2\sum_{h=1}^{n} (n - h + 1)\gamma(h).$$

The formula implies that for a stationary process $X_n$ with zero mean,
$E[X_n] = 0$, we have

$$\lim_{n\to\infty}\frac{1}{n}E\left[\left(\sum_{i=0}^{n} X_i\right)^2\right] = \sigma^2 + 2\sum_{h=1}^{\infty}\gamma(h), \tag{3}$$

provided the sum on the RHS converges. In fact, under somewhat stricter
conditions on $\gamma(h)$ the central limit theorem gives convergence of $n^{-1/2}\sum_{j=0}^{n} X_j$
to a Gaussian random variable of zero mean and variance given by the RHS
of Eq. (3).

After an initial stage, a frequency source approaches a process that can be
described as a mixture of a stationary process and a drift (also called aging).
If the latter is negligible, the frequency source is a stationary process. The decay of
the correlation function $\zeta(h)$ roughly describes the stability of the source.

² In standard notation this would be denoted by $\rho(h)$; however, we shall need $\rho$ to denote
a quantum state.

A process $\{X_n\}$ is a martingale if $E[X_{n+1}|X_n] = X_n$ and it is Markov if
the future depends on the past only through the present,
$E[X_{n+1}|X_j, j \leq n] = E[X_{n+1}|X_n]$. The Markov property can be equivalently stated as:
past and future are independent given the present. This is the first part of
the following lemma.

Lemma 3 Suppose that $\{X_1, X_2, X_3\}$ is a Markov chain. Then

$$E[X_1X_3|X_2] = E[X_1|X_2]\,E[X_3|X_2].$$

Furthermore, when $E[X_3|X_2] = \zeta X_2$ for some $\zeta \in \mathbb{R}$, then

$$E[X_1X_3] = \zeta E[X_1X_2].$$

Proof: The first equation is the equivalent definition of Markov property as 
mentioned in the text above the lemma, see |Hl Chapter II. 6]. We prove the 
second part, 

$$\begin{aligned}
E[X_1X_3] &= E[E[X_1X_3|X_2]] \\
&= E[E[X_1|X_2]\,E[X_3|X_2]] \\
&= E[E[X_1|X_2]\,\zeta X_2] \\
&= \zeta E[X_1X_2].
\end{aligned}$$

In the first and last equality we used Eq. (2). □
We end this section with examples of various stochastic processes appear- 
ing in the following sections. 

Example 4 (Standard diffusion) White noise is a stationary process, $X_t$,
of uncorrelated random variables. They have a constant mean $\mu$ and autocorrelation
function

$$\gamma(h) := C(X_{t+h}, X_t) = D\delta(h).$$

The integral of white noise, $B_t = \int_0^t X_s\,ds$, is a Brownian motion. Its mean
and variance are given by the formulas

$$E[B_t] = \mu t, \qquad C(B_{t+h}, B_t) = 2Dt. \tag{4}$$

A drift $\mu$ and a diffusion coefficient $D$ are constants whose physical dimension
($[\cdot]$) depends on the process. More precisely, $[\mu] = [X_s]$, $[D] = [X_s]^2$.
Brownian motion $B_t$ is a continuous martingale.

Example 5 (Gaussian random process) A discrete process is called
Gaussian if the joint probability distribution of $X_{n_1}, \dots, X_{n_j}$ is a multivariate
normal distribution for any $j$-tuple $n_1, \dots, n_j$.

The Gaussian process is completely determined by the means $\mu(X_n)$ and
covariances $C(X_n, X_m)$. A particular property of interest (see [5, Chapter
7.3]) is that for stationary Gaussian processes with zero mean and variance
$\sigma^2$ it holds that

$$E[X_{n+1}|X_n] = \zeta X_n,$$

where $\zeta = E[X_{n+1}X_n]/\sigma^2$.

Example 6 (Exponentially decaying correlations) A stationary discrete
stochastic process has exponentially decaying correlations if for some $|\zeta| < 1$,

$$\gamma(h) = \zeta^{|h|}\sigma^2.$$

The variance of the associated integrated process can be computed explicitly by
summing a geometric series. Note that the result is consistent with Eq. (3),

$$C\left(\sum_{n=0}^{N} X_n, \sum_{n=0}^{N} X_n\right) = N\sigma^2\,\frac{1+\zeta}{1-\zeta} + O(1). \tag{5}$$
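The geometric-series computation behind Eq. (5) can be checked numerically. The sketch below assumes the autocovariance $\gamma(h) = \zeta^{h}\sigma^2$ for $h \geq 0$ and uses the formula for the variance of an integrated stationary process given earlier in this section; the function names are illustrative only.

```python
def integrated_variance(n, sigma2, zeta):
    """Exact variance of X_0 + ... + X_n when gamma(h) = sigma2 * zeta**h.

    Uses (n + 1) * gamma(0) + 2 * sum_{h=1}^{n} (n - h + 1) * gamma(h).
    """
    total = (n + 1) * sigma2
    for h in range(1, n + 1):
        total += 2.0 * (n - h + 1) * sigma2 * zeta ** h
    return total


def asymptotic_rate(sigma2, zeta):
    """Growth rate per step for large n: sigma2 * (1 + zeta) / (1 - zeta)."""
    return sigma2 * (1.0 + zeta) / (1.0 - zeta)
```

For example, with `sigma2 = 1` and `zeta = 0.5` the rate is 3, and the exact variance differs from `3 * n` only by a bounded correction as `n` grows.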



3 Estimation theory 

Estimation theory studies strategies for estimating an unknown physical
parameter $\varphi$ based on data collected from a single or multiple measurements.
In the classical estimation theory there is usually a one-to-one
correspondence between an ideal measurement and the unknown. The problem
is then to decrease the measurement error using large data sets.

Our exposition of the estimation theory is directed towards applications
in atomic clocks. A reader can find a general reference in e.g. [7].

We examine strategies to estimate a parameter $\varphi \in \mathbb{R}$ based on a measurement
outcome $\alpha$. We assume that the space of outcomes, $\mathcal{M}$, is a probability
space equipped with a measure $d\alpha$. Measurement outcomes $\alpha(\varphi)$ are a family
of random variables with probability distributions $p(\alpha|\varphi)$, the conditional
probability distribution of an outcome $\alpha$ given $\varphi$. The estimation strategy
is then defined by an estimator $\Phi$. Upon a measurement outcome $\alpha$ a guess
$\Phi(\alpha)$ is made; $\Phi$ is a function from the space of outcomes to the real numbers.

In a Bayesian approach to the estimation theory $\varphi$ is a random variable
with a certain prior probability distribution $q(\varphi)$. It is then convenient to
view its estimation, $\hat\varphi$, as a random variable.

Definition 7 (Estimation) Let $\varphi$ be a real-valued random variable on a
probability space $(\Omega, d\mu)$, $\alpha(\cdot)$ be a family of measurement outcomes and $\Phi$
an estimator. Then an estimation $\hat\varphi$ of $\varphi$ is the random variable

$$\hat\varphi := \Phi \circ \alpha(\varphi).$$

In explicit terms, this is a random variable

$$\hat\varphi : (\Omega \otimes \mathcal{M},\; d\mu(x) \otimes p(\alpha|\varphi(x))\,d\alpha)$$

given by

$$\hat\varphi(x, \alpha) = \Phi(\alpha).$$

It is common to denote the estimator $\Phi$ and the estimation $\hat\varphi$ by the same
letter. This is indeed convenient if $\varphi$ is fixed with a given prior distribution.

However, we will consider estimations of a chain of random variables based
on a fixed estimator $\Phi$. For that reason we prefer to stress in our notation
that $\hat\varphi$ depends on the random variable that is estimated while $\Phi$ is a fixed
function.

Unbiased estimators play a central role in the estimation theory. 

Definition 8 (Unbiased estimation) Fix a family $\alpha(\cdot)$. We say that an
estimator $\Phi$ is $\zeta$-biased if for all random variables $\varphi$

$$E[\hat\varphi|\varphi] = (1 - \zeta)\varphi.$$

The estimator is unbiased if it is 0-biased. We also say that an estimator is
conditionally unbiased if

$$E[\varphi] = 0 \implies E[\hat\varphi] = 0.$$

A $\zeta$-biased estimation (for $\zeta \neq 0$) is not a common concept; in fact
a $\zeta$-biased estimator is proportional to an unbiased estimator. However, the
parameter $\zeta$ will play an important role in our description of an atomic clock.
Note also that $\{\varphi, \hat\varphi + \zeta\varphi\}$ is a martingale, $E[\hat\varphi + \zeta\varphi|\varphi] = \varphi$. In particular
we will often use that for a $\zeta$-biased estimation

$$E[\varphi(\hat\varphi - (1 - \zeta)\varphi)] = 0. \tag{6}$$

The $\zeta$-biased property of an estimation can be equivalently stated by referring
only to the conditional probability distribution $p(\hat\varphi|\varphi)$. Consequently
we often say that $\hat\varphi$ is a $\zeta$-biased estimation of $\varphi$, i.e. an estimation
of the unknown $\varphi$ based on a $\zeta$-biased estimator.

The following lemma summarizes various useful statements about unbi- 
ased estimators. 

Lemma 9 Fix a family $\alpha(\cdot)$ of measurement outcomes. For an estimator $\Phi$
the following are equivalent:

(i) $\Phi$ is conditionally unbiased,

(ii) there exists $\zeta \in \mathbb{R}$ such that $\Phi$ is a $\zeta$-biased estimator.

Suppose in addition that $\zeta \neq 1$. Then a $\zeta$-biased estimator $\Phi$ has the form
$\Phi = (1 - \zeta)\Phi_0$ where $\Phi_0$ is an unbiased estimator.

Proof: (i) $\Rightarrow$ (ii): Let $p(\hat\varphi|\varphi)$ be the conditional probability distribution of
$\hat\varphi$ given $\varphi$ and let $q(\varphi)$ be a probability distribution of $\varphi$. Then (i) states
that for all distributions $q(\varphi)$ with zero mean it holds

$$\int \hat\varphi\,p(\hat\varphi|\varphi)\,q(\varphi)\,d\hat\varphi\,d\varphi = 0.$$

A standard variational argument implies

$$\int \hat\varphi\,p(\hat\varphi|\varphi)\,d\hat\varphi = (1 - \zeta)\varphi$$

for some $\zeta \in \mathbb{R}$. This is exactly (ii).

(ii) $\Rightarrow$ (i): For a random variable $\varphi$ with zero mean and a $\zeta$-biased
estimation $\hat\varphi$ it holds

$$E[\hat\varphi] = E[E[\hat\varphi|\varphi]] = (1 - \zeta)E[\varphi] = 0.$$

When $\Phi$ is a $\zeta$-biased estimator and $\zeta \neq 1$ then $(1 - \zeta)^{-1}\Phi$ is clearly an
unbiased estimator. □

Most of the work in estimation theory is centered on minimizing a certain
cost of $\hat\varphi - \varphi$ not being zero. We discuss this in the following section.

3.1 A cost of the estimation

A cost of the estimation (i.e. a functional we aim to minimize) is given by

$$\mathrm{Cost} = E[(\hat\varphi - \varphi)^2] = \int (\varphi - \Phi(\alpha))^2\,p(\alpha|\varphi)\,q(\varphi)\,d\alpha\,d\varphi, \tag{7}$$

where $q(\varphi)$ is a prior probability distribution of $\varphi$. The choice of the cost
function is to a large extent arbitrary. The quadratic cost function is distinguished
by its simplicity and a direct relation to variance, the quantity that
is most suitable for a description of time precision.

It is well known how to optimize the cost, Eq. (7), with respect to the
estimator $\Phi$ for a fixed prior distribution of the variable $\varphi$.

Lemma 10 (Optimal estimator) Fix a conditional probability distribution
$p(\alpha|\varphi)$ and a prior distribution $q(\varphi)$. Then the estimator

$$\Phi(\alpha) = E[\varphi|\alpha] = \int \varphi\,p(\varphi|\alpha)\,d\varphi \tag{8}$$

minimizes the cost, Eq. (7), with respect to the estimator $\Phi(\cdot)$.

Proof: We use the formula $E[Z(\alpha)Y] = E[Z(\alpha)E[Y|\alpha]]$ twice to rewrite
the cost as

$$\begin{aligned}
E[(\Phi(\alpha) - \varphi)^2] &= E[\Phi(\alpha)^2 - 2\Phi(\alpha)\varphi + \varphi^2] \\
&= E[(\Phi(\alpha) - E[\varphi|\alpha])^2] + E[(E[\varphi|\alpha] - \varphi)^2].
\end{aligned}$$

The last expression is a sum of two squares, where the second is independent
of $\Phi$. Hence the minimum is achieved when the first square vanishes. □
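As a sanity check of Lemma 10, consider a Gaussian toy model that is not taken from the text: the prior is a zero-mean Gaussian with variance `prior_var`, and the outcome is the unknown plus independent zero-mean Gaussian noise with variance `noise_var`. In this model the conditional expectation is linear in the outcome, and the quadratic cost of a linear estimator with gain `c` is `(1 - c)**2 * prior_var + c**2 * noise_var`. The names below are hypothetical.

```python
def linear_cost(c, prior_var, noise_var):
    """Quadratic cost E[(c * alpha - phi)^2] for alpha = phi + noise."""
    return (1.0 - c) ** 2 * prior_var + c ** 2 * noise_var


def posterior_mean_gain(prior_var, noise_var):
    """Gain c such that E[phi | alpha] = c * alpha in the Gaussian toy model."""
    return prior_var / (prior_var + noise_var)


def best_gain_on_grid(prior_var, noise_var, steps=1000):
    """Brute-force search over gains c in [0, 1] for comparison."""
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda c: linear_cost(c, prior_var, noise_var))
```

The brute-force minimizer agrees with the posterior-mean gain up to the grid resolution, as the lemma suggests it should.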
The explicit expression, Eq. (8), is often hard to analyze. This is the case
when the conditional probability distribution $p(\alpha|\varphi)$ has an analytical expression
but there is no such expression for the conditional probability
distribution $p(\varphi|\alpha)$. In such cases bounds of the cost from below are very
useful. Of such bounds the most famous is the Cramer-Rao bound, a variant
of which we present here. It bounds the cost from below using the Fisher
information. This is a point-wise quantity that (roughly speaking) measures
how fast a conditional probability distribution changes with the value of
the condition.

The Fisher information, $F(\varphi)$, associated to a family of measurement
outcomes $\alpha(\cdot)$ (or probability distributions $p(\alpha|\varphi)$) is given by

$$F(\varphi) := \int \left(\frac{\partial}{\partial\varphi}\log p(\alpha|\varphi)\right)^2 p(\alpha|\varphi)\,d\alpha. \tag{9}$$

An important property of the Fisher information is that it decreases under
processing of the information: the Fisher information associated to the family
$\Phi \circ \alpha(\cdot)$ is always less than or equal³ to the Fisher information associated to the
family $\alpha(\cdot)$. This fact allows us to simplify the notation in the proofs below:
we average over the conditional probability distribution $p(\hat\varphi|\varphi)$ and keep in
mind that the Fisher information of Eq. (9) satisfies

$$F(\varphi) \geq \int \left(\frac{\partial}{\partial\varphi}\log p(\hat\varphi|\varphi)\right)^2 p(\hat\varphi|\varphi)\,d\hat\varphi. \tag{10}$$
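To make the definition of the Fisher information concrete, the sketch below evaluates it for a two-outcome measurement modeled on the Ramsey example of the introduction, with outcome probabilities $\cos^2(\varphi\tau/2)$ and $\sin^2(\varphi\tau/2)$. This is an illustrative assumption: the integral over outcomes becomes a sum over two outcomes, the derivative is approximated by a central finite difference, and the function names are hypothetical.

```python
import math


def ramsey_probs(phi, tau):
    """Outcome probabilities [P(e), P(g)] for accumulated phase phi * tau."""
    half_phase = 0.5 * phi * tau
    return [math.cos(half_phase) ** 2, math.sin(half_phase) ** 2]


def fisher_information(probs, phi, tau, eps=1e-6):
    """Discrete Fisher information: sum over outcomes of (dp/dphi)^2 / p."""
    p = probs(phi, tau)
    p_plus = probs(phi + eps, tau)
    p_minus = probs(phi - eps, tau)
    total = 0.0
    for p0, pp, pm in zip(p, p_plus, p_minus):
        dp = (pp - pm) / (2.0 * eps)  # central finite difference
        total += dp * dp / p0         # (d log p)^2 * p equals (dp)^2 / p
    return total
```

For this two-outcome model a direct calculation gives a Fisher information equal to `tau**2`, independent of `phi` (away from points where one probability vanishes), which the numerical value reproduces.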

The original Cramer-Rao bound (that we present in an integrated version)
is the following statement.

Proposition 11 Suppose that $\hat\varphi$ is an unbiased estimation (i.e. an estimation
based on an unbiased estimator) of a random variable $\varphi$. Then

$$E[(\varphi - \hat\varphi)^2] \geq E\left[\frac{1}{F(\varphi)}\right]. \tag{11}$$

Proof: For an unbiased estimation the conditional probability, $p(\hat\varphi|\varphi)$, of $\hat\varphi$
given $\varphi$ satisfies

$$\int \hat\varphi\,p(\hat\varphi|\varphi)\,d\hat\varphi = \varphi.$$

We differentiate the expression, subtract zero and use the Cauchy-Schwarz
inequality,

$$\begin{aligned}
1 &= \left(\int (\hat\varphi - \varphi)\,\partial_\varphi p(\hat\varphi|\varphi)\,d\hat\varphi\right)^2 \\
&\leq F(\varphi)\int (\hat\varphi - \varphi)^2\,p(\hat\varphi|\varphi)\,d\hat\varphi.
\end{aligned} \tag{12}$$

Dividing by $F(\varphi)$ gives a pointwise version of the inequality; Eq. (11) can
then be obtained by applying $E[\cdot]$ to both sides. □

³ In fact the equality holds if and only if $\Phi \circ \alpha$ is a sufficient statistic for $\varphi$.
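An immediate corollary is a bound for a $\zeta$-biased estimation.

Corollary 12 Suppose that $\hat\varphi$ is a $\zeta$-biased estimation of a random variable
$\varphi$. Then

$$E[(\hat\varphi - \varphi)^2] \geq (1 - \zeta)^2 E\left[\frac{1}{F(\varphi)}\right] + \zeta^2 E[\varphi^2]. \tag{13}$$

Proof: For $\zeta \neq 1$ the estimation $\hat\varphi/(1 - \zeta)$ is unbiased and the statement
follows from

$$\begin{aligned}
E[(\hat\varphi - \varphi)^2] &= (1 - \zeta)^2 E\left[\left(\varphi - \frac{\hat\varphi}{1 - \zeta}\right)^2\right] + \zeta^2 E[\varphi^2] \\
&\geq (1 - \zeta)^2 E\left[\frac{1}{F(\varphi)}\right] + \zeta^2 E[\varphi^2],
\end{aligned}$$

where the equality in the first line follows from the orthogonality of $(1 - \zeta)\varphi - \hat\varphi$
and $\varphi$, see Eq. (6).

The case $\zeta = 1$ is somewhat special⁴. In view of $E[\varphi\hat\varphi] = 0$ it then holds

$$E[(\hat\varphi - \varphi)^2] = E[\hat\varphi^2] + E[\varphi^2],$$

and we see that the optimal estimation is $\hat\varphi = 0$. □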
Van Trees [22] proved a Cramer-Rao type bound for an arbitrary estimator.
We give a version of this bound, Eq. (16), that generalizes Eq. (13) and
which to the best of our knowledge is new. It recognizes the role of correlations
between $\varphi$ and $\hat\varphi$ in the Cramer-Rao inequality.

An extension of the Cramer-Rao inequality beyond unbiased estimators comes
at the expense of a less natural averaging of the Fisher information or the
introduction of additional terms. We choose the former approach because it
has the simplest proof and gives the nicest formulas; a note on the other approach
will be given elsewhere. For a given probability distribution $q(\varphi)$
we introduce an average Fisher information

$$\bar F := \int F(\varphi)\,g(\varphi)^2\,q(\varphi)\,d\varphi, \tag{14}$$

where

$$g(\varphi) = \frac{\int_\varphi^\infty (s - \mu(q))\,q(s)\,ds}{\sigma^2(q)\,q(\varphi)}. \tag{15}$$

⁴ And completely unimportant.

For simplicity of the exposition we assume in the following theorem that
$\varphi$ has zero mean. This is also the only case we will use in the article.

Theorem 13 Fix a family of measurement outcomes a(-) and let (p he an 
estimation of a random variable (of zero mean) with a prior probability 
distribution q{ip). Denote ( := E,[{ip — 0)ip]/E,[Lp'^]. Then it holds 

E[(v.-^)^]>(i-C)'i + CW], (16) 



where F is the average Fisher information, Eq. (14), associated to the family 



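Proof: The definition of $\zeta$ implies that the variables $(1-\zeta)\varphi - \hat\varphi$ and $\varphi$ are orthogonal with respect to the natural scalar product. This suggests (and proves) a decomposition

$$\mathbb{E}[(\hat\varphi - \varphi)^2] = \mathbb{E}[(\hat\varphi - (1-\zeta)\varphi)^2] + \zeta^2\, \mathbb{E}[\varphi^2]. \qquad (17)$$

Now we bound the first term on the RHS. Using the definition of $\zeta$ once again we have

$$\int\!\!\int \hat\varphi\, p(\hat\varphi|\varphi)\, \varphi\, q(\varphi)\, d\varphi\, d\hat\varphi = (1-\zeta)\, \mathbb{E}[\varphi^2].$$

It follows by integration by parts that for any $a \in \mathbb{R}$ (the term proportional to $a$ is point-wise zero)

$$\int\!\!\int (\hat\varphi - a\varphi)\, \frac{\partial}{\partial\varphi} p(\hat\varphi|\varphi) \left( \int_\varphi^\infty s\, q(s)\, ds \right) d\varphi\, d\hat\varphi = (1-\zeta)\, \mathbb{E}[\varphi^2].$$

This further implies (note the definition of $\tilde q$, Eq. (15))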



<E[{0-avf] I ^(^)^d^ 
< E[{0-aif)^]F. 



Inserting this into Eq. (17) proves the sought inequality. □

The inequality, Eq. (16), naturally bridges between the classical Cramér-Rao inequality for an unbiased estimator and a global Cramér-Rao inequality. To see this note that minimizing over $\zeta$ on the right hand side gives us a clone of the Van Trees inequality (see [22]),

$$\mathbb{E}[(\hat\varphi - \varphi)^2] \ge \inf_\zeta \left\{ (1-\zeta)^2\, \frac{1}{\bar F} + \zeta^2\, \mathbb{E}[\varphi^2] \right\} = \frac{1}{\bar F + 1/\mathbb{E}[\varphi^2]}.$$

On the other hand $\zeta = 0$ reproduces Eq. (11) up to a different averaging of the Fisher information.
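
As a quick numerical sanity check of the minimization over $\zeta$, the following Python sketch compares a brute-force grid minimum of the right hand side of Eq. (16) with the closed form $1/(\bar F + 1/\mathbb{E}[\varphi^2])$. The function names and the sample values of $\bar F$ and $\mathbb{E}[\varphi^2]$ are illustrative assumptions and do not come from the text.

```python
def rhs_of_eq16(zeta, f_bar, var_phi):
    """Right hand side of Eq. (16): (1 - zeta)^2 / F_bar + zeta^2 * E[phi^2]."""
    return (1.0 - zeta) ** 2 / f_bar + zeta ** 2 * var_phi


def van_trees_value(f_bar, var_phi):
    """Closed form of the infimum over zeta: 1 / (F_bar + 1 / E[phi^2])."""
    return 1.0 / (f_bar + 1.0 / var_phi)


def grid_minimum(f_bar, var_phi, steps=20001):
    """Brute-force minimum of the right hand side over zeta in [-1, 1]."""
    best_zeta, best_value = None, float("inf")
    for k in range(steps):
        zeta = -1.0 + 2.0 * k / (steps - 1)
        value = rhs_of_eq16(zeta, f_bar, var_phi)
        if value < best_value:
            best_zeta, best_value = zeta, value
    return best_zeta, best_value


if __name__ == "__main__":
    f_bar, var_phi = 4.0, 0.5  # hypothetical sample values
    zeta_star, minimum = grid_minimum(f_bar, var_phi)
    print(zeta_star, minimum, van_trees_value(f_bar, var_phi))
```

Elementary calculus gives the minimizer $\zeta = 1/(1 + \bar F\, \mathbb{E}[\varphi^2])$, which the grid search should reproduce up to the grid spacing.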



The special averaging of Theorem 13 is very suitable for Gaussian prior 
distributions. 



Example 14 Suppose that the prior distribution $q(\varphi)$ of a random variable $\varphi$ is Gaussian, then $\bar F$ is an average Fisher information with respect to the distribution $q$, i.e.

$$\bar F = \mathbb{E}[F(\varphi)].$$

Note however that by the Jensen inequality

$$\mathbb{E}\left[\frac{1}{F(\varphi)}\right] \ge \frac{1}{\mathbb{E}[F(\varphi)]}$$

and so if an estimator is $\zeta$-biased the Cramér-Rao inequality (13) gives a better bound than (16) even in this case. The inequalities coincide only if we further assume that the Fisher information $F(\varphi)$ is constant.

Proof: One can directly verify that $\tilde q(\varphi)$ of Eq. (15) associated to a Gaussian distribution $q(\varphi)$ satisfies $\tilde q(\varphi) = q(\varphi)$. □



4 Quantum estimation theory 

In contrast to the classical estimation theory, quantum measurements cannot 
distinguish between non-orthogonal states even in the ideal situation of no 
external noise. This gives a fundamental bound on estimation precision which 
is referred to as the Heisenberg limit. Unlike the classical case, the probability 
distribution of this intrinsic quantum measurement error is described by the theory itself.

Throughout the text we fix a Hilbert space $\mathcal{H}$ of, possibly infinite, dimension $N$. A state, $\rho$, on $\mathcal{H}$ is a positive operator of unit trace. A pure state is represented by a one dimensional projection, which we mostly denote by $P$. A POVM measurement is defined by operators $\Pi(a) \ge 0$ that decompose the identity, $\int \Pi(a)\, da = 1$. The outcome $a$, given the state $\rho$, is a random variable with a probability distribution $\mathrm{tr}(\rho\Pi(a))$ with respect to a (possibly singular) measure $da$. As before, the space of outcomes is a probability space.

We examine strategies to estimate a parameter $\varphi \in \mathbb{R}$ of a quantum state $\rho = \rho(\varphi)$, whose dependence on the parameter $\varphi$ is known. The estimation strategy is defined by a POVM measurement $\Pi(a)$ and an estimator $\Phi(a)$. A POVM measurement $\Pi(a)$ induces a conditional probability distribution of measurement outcomes $p(a|\varphi) = \mathrm{tr}(\Pi(a)\rho(\varphi))$ and hence for every fixed POVM measurement we obtain a well posed classical estimation problem.
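
To make the reduction to a classical problem concrete, here is a minimal Python sketch for a single qubit. The family $\rho(\varphi) = |\psi(\varphi)\rangle\langle\psi(\varphi)|$ with $\psi(\varphi) = (\cos(\varphi/2), \sin(\varphi/2))$ and the measurement in the computational basis are assumptions chosen for illustration only; they are not the states used later in the text. The sketch evaluates $p(a|\varphi) = \mathrm{tr}(\Pi(a)\rho(\varphi))$ and the classical Fisher information $\sum_a (\partial_\varphi p(a|\varphi))^2 / p(a|\varphi)$ of the induced distribution.

```python
import math


def state(phi):
    """Density matrix rho(phi) = |psi><psi| with psi = (cos(phi/2), sin(phi/2))."""
    c, s = math.cos(phi / 2.0), math.sin(phi / 2.0)
    return [[c * c, c * s], [c * s, s * s]]


# Projective (hence POVM) measurement in the computational basis;
# the two operators sum to the identity.
POVM = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]


def trace_of_product(a, b):
    """tr(a b) for 2x2 matrices given as nested lists."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


def outcome_probabilities(phi):
    """p(a|phi) = tr(Pi(a) rho(phi)) for each outcome a."""
    rho = state(phi)
    return [trace_of_product(pi, rho) for pi in POVM]


def classical_fisher_information(phi, h=1e-5):
    """Fisher information of p(a|phi), using a central finite difference in phi."""
    p = outcome_probabilities(phi)
    p_plus = outcome_probabilities(phi + h)
    p_minus = outcome_probabilities(phi - h)
    total = 0.0
    for a in range(len(p)):
        dp = (p_plus[a] - p_minus[a]) / (2.0 * h)
        total += dp * dp / p[a]
    return total
```

For this particular family the probabilities are $\cos^2(\varphi/2)$ and $\sin^2(\varphi/2)$, so the classical Fisher information equals 1 whenever $\varphi$ is not a multiple of $\pi$.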

We call a couple $\{\Pi(a), \Phi\}$ an estimation strategy. Given such a strategy and a real valued random variable $\varphi$ we defined (Definition [?]) its estimation $\hat\varphi$. All associated definitions generalize in a straightforward manner, e.g. an estimation strategy is $\zeta$-biased if for all random variables $\varphi$ it holds that

$$\mathbb{E}[\hat\varphi\,|\,\varphi] = (1-\zeta)\varphi.$$

Now let $\{\Pi(a), \Phi\}$ be an estimation strategy. Then the conditional probability distribution function $p(\hat\varphi|\varphi)$ of $\hat\varphi$ conditioned upon $\varphi$ is given by (we assume that $|\nabla\Phi| > 0$ and use the coarea formula)

$$p(\hat\varphi|\varphi) = \int_{\Phi^{-1}(\hat\varphi)} \mathrm{tr}(\rho(\varphi)\Pi(x))\, |\nabla\Phi(x)|^{-1}\, d\nu(x), \qquad (18)$$

where $d\nu$ is the measure induced by $da$ on the manifold $\Phi^{-1}(\hat\varphi)$. In particular we see that a POVM measurement with outcomes $\hat\varphi \in \mathbb{R}$ given by

$$\Pi(\hat\varphi) = \int_{\Phi^{-1}(\hat\varphi)} \Pi(x)\, |\nabla\Phi(x)|^{-1}\, d\nu(x)$$

and an identity estimator function is equivalent to the original pair $\{\Pi(a), \Phi\}$. The equivalence of these two pairs can also be explained in down to earth language: the label $a$ of the measurement outcome is a superficial quantity and we can always re-parameterize it so that the measurement outcome is the estimation itself. In particular $\mathrm{tr}(\Pi(\hat\varphi)\rho(\varphi))$ is the conditional probability of $\hat\varphi$ given $\varphi$.

For an exposition of the subject it is more convenient to use the original pair $\{\Pi, \Phi\}$; however inside proofs we sometimes assume that without loss of generality the estimator is the identity function.

The cost of the estimation is now given by, cf. Eq. [?],

$$\mathrm{Cost} = \mathbb{E}[(\varphi - \hat\varphi)^2],$$

where the expectation is taken over the measurement outcomes and over the prior distribution $q(\varphi)$ of the random variable $\varphi$.



For a fixed POVM measurement $\Pi(a)$ Lemma 10 describes the optimization of the cost with respect to the estimator $\Phi(\cdot)$. It is natural to further try to optimize the cost with respect to the POVM measurement $\Pi(a)$. This optimization problem can also be solved algebraically (see [?, Chapter 8.1.(d)] and also [6]) and once again the explicit expression is hard to analyze. Therefore we discuss bounds from below.

The quantum Cramer-Rao bound is a generalization of the classical one. 
It bounds the cost from below using the (quantum) Fisher information, which 
is a point-wise quantity that (roughly speaking) measures how fast a family 
of states p{ip) is changing at a given point. 

The Fisher information F{(p) is given by the expression 

F(y,)=tr(p(^)X(y,)2), 
where $X(\varphi)$ is a solution† of the equation

^{X(^),p(^)} = p(^), (■ = A). 

When p{(f) = P{(f) is a family of projections then X{(p) = P{(f) and Fisher 
information is proportional to the Fubini-Study metric, F{ip) = 2tr(P(<^)^). 
Also note that the Fisher information $F_\tau(\varphi)$ associated to a family $\rho(\tau\varphi)$, ($\tau \in \mathbb{R}$), satisfies

$$F_\tau(\varphi) = \tau^2 F(\tau\varphi). \qquad (19)$$

In parallel to the classical case we define $\bar F$ with respect to a probability distribution $q$ by Eq. (14).



† The equation does not determine $QX(\varphi)Q$, where $Q$ is the orthogonal projection on $\mathrm{Ker}(\rho(\varphi))$, and this part of $X(\varphi)$ can be chosen arbitrarily.

Braunstein and Caves [3] give a connection between the classical Fisher information associated to a fixed measurement $\Pi(a)$ and the quantum Fisher information.

Proposition 15 Consider a family of states $\rho(\varphi)$ and POVM measurements $\Pi(a)$. Let $F(\varphi)$ be the quantum Fisher information associated to the family $\rho(\varphi)$ and let $F_\Pi(\varphi)$ be the Fisher information, Eq. [?], associated to the conditional probability distribution

$$p(a|\varphi) = \mathrm{tr}(\Pi(a)\rho(\varphi)).$$

Then it holds that

$$F(\varphi) = \sup_\Pi F_\Pi(\varphi),$$

where the supremum is taken over all POVM measurements $\Pi(a)$.

Proof: Let $X$ be a hermitian operator and $A$, $B$ non-negative operators, then the following inequality holds true

$$|\mathrm{tr}(X\{A, B\})|^2 \le 4\, \mathrm{tr}(AB)\, \mathrm{tr}(AXBX).$$

Note that $\{A, B\}$ does not need to be a positive operator and so we cannot apply the Cauchy-Schwarz inequality on the LHS immediately. However we can do it after expanding the anti-commutator and in this way we get the RHS.

We use the inequality for $X = X(\varphi)$, $A = \rho(\varphi)$ and $B = \Pi(a)$ (we omit the arguments of the operators),

$$|\mathrm{tr}(\Pi\dot\rho)|^2 = |\mathrm{tr}(\Pi\{X, \rho\})|^2 = |\mathrm{tr}(X\{\Pi, \rho\})|^2 \le 4\, \mathrm{tr}(\Pi\rho)\, \mathrm{tr}(X\rho X\Pi).$$

Hence we have the following estimate for the classical Fisher information,

$$F_\Pi(\varphi) = \int \frac{(\mathrm{tr}(\Pi(a)\dot\rho(\varphi)))^2}{\mathrm{tr}(\Pi(a)\rho(\varphi))}\, da \le \int 4\, \mathrm{tr}(X(\varphi)\rho(\varphi)X(\varphi)\Pi(a))\, da = F(\varphi),$$

the last expression being the quantum Fisher information. Equality can be achieved by taking $\Pi(a)$ as a spectral decomposition of $X(\varphi)$. □
All versions of a quantum Cramér-Rao bound are then immediate corollaries. We present one as an example, which is a compilation of the bounds (13) and (16).

Theorem 16 Consider a family of states $\rho(\varphi)$ and let $F(\varphi)$ be the associated quantum Fisher information. Let $\hat\varphi$ be an estimation of a random variable $\varphi$ (of zero mean) with prior distribution $q(\varphi)$ and denote $\zeta := \mathbb{E}[(\varphi - \hat\varphi)\varphi]/\mathbb{E}[\varphi^2]$. Then it holds

$$\mathbb{E}[(\hat\varphi - \varphi)^2] \ge (1-\zeta)^2\, \frac{1}{\bar F} + \zeta^2\, \mathbb{E}[\varphi^2], \qquad (20)$$

where $\bar F$ is the average Fisher information, Eq. (14).

If furthermore $\hat\varphi$ is an unbiased estimation then the term $1/\bar F$ in the inequality can be replaced by a simple average $\mathbb{E}[1/F(\varphi)]$.



5 A model of atomic clocks 

The main variable of our model is the instantaneous frequency error $\varphi(t)$. It is a real valued random variable related to the frequency of the clock's oscillator, $\omega(t) = \omega_0 + \varphi(t)$. In this section we define the process $\{\varphi(t)\}_{t \ge 0}$. The clock time is then obtained by an integration,

$$t_{\mathrm{clock}} = \frac{1}{\omega_0} \int_0^t \omega(s)\, ds.$$

The model consists of various parameters/objects that determine the pro- 
cess; we list them here: 

• Two time scales, $T \ge \tau$. $T$ is the time between two consecutive synchronizations and $\tau$ is the interrogation time.

• A Markovian stochastic process $K_t\varphi$ that describes the evolution of the error in the absence of synchronization given an initial condition $K_0\varphi = \varphi$.

• A family of states $\rho(\varphi)$ and an estimation strategy $\{\Pi(a), \Phi\}$. These objects describe the synchronization.

The adjustment of the error after the synchronization is performed periodically at times $nT$, $n \in \mathbb{N}$. We denote

$$\varphi_n(t) := \varphi(nT + t), \quad \text{for } t \in [0, T),$$
$$\bar\varphi_n := \frac{1}{\tau} \int_{T-\tau}^{T} \varphi_n(s)\, ds. \qquad (21)$$

We also abbreviate $\varphi_n := \varphi_n(0) = \varphi(nT)$. The stochastic process $\varphi(t)$ is defined implicitly by an initial condition $\varphi_0$ and the equations

$$\varphi_n(t) = K_t\varphi_n \quad \text{for } t \in [0, T), \qquad (22)$$
$$\varphi_{n+1} = K_T\varphi_n - \hat\varphi_n. \qquad (23)$$

The random variable $\hat\varphi_n$ appearing in the last line is an estimation of $\bar\varphi_n$. The estimation is obtained by a measurement $\Pi(a)$ on a state $\rho(\tau\bar\varphi_n)$, employing the interrogation timescale $\tau$.

Note that by definition, $\varphi(t)$ is right continuous at the jumps occurring at the times $t = nT$. With a small abuse of notation we denote the left limit by $\varphi_n(T)$, i.e. $\varphi_n(T) := K_T\varphi_n$.

Definition 17 We call a triple $\{\rho(\varphi), \Pi(a), \Phi\}$ an atomic clock. $T$, $\tau$ and $K_t$ are locally constant and will be clear from the context. A solution $\varphi(t)$ of Eqs. (22), (23) is called a state of an atomic clock.

Remark 18 We stress again that the estimation is obtained by the measurement on the family $\rho(\tau\varphi)$. Whenever we say that the estimation procedure, $\{\Pi(a), \Phi\}$, of a clock is $\zeta$-biased, we mean that it is $\zeta$-biased with respect to the family $\rho(\tau\varphi)$. This amounts to a factor $1/\tau$ in the estimator, cf. Section 6.



Eq. (22) describes the evolution in the absence of synchronization, Eq. (23) describes jumps due to the synchronization and the corresponding adjustment of frequency. The latter equation defines a (sub)process $\varphi_n$. This is a Markovian process that encodes the synchronization and hence has a distinguished role.

Definition 19 (Stationary state of a clock) We say that $\varphi(t)$ describes a stationary state of an atomic clock if $\varphi_n$ is a stationary process.

A stationary state, $\varphi(t)$, of an atomic clock is $T$ periodic, meaning that the joint probability distributions of $\varphi(t_1), \ldots, \varphi(t_n)$ and $\varphi(t_1 + T), \ldots, \varphi(t_n + T)$ are identical. This in particular implies that an integral of a stationary state over the period $T$,

$$\theta_n := \int_0^T \varphi_n(s)\, ds,$$



is a stationary process.

We aim to study a situation when the clock time is unbiased, $\mathbb{E}[t_{\mathrm{clock}}] = t + O(1)$. In terms of the error $\varphi(t)$ this means

$$\frac{1}{\omega_0} \int_0^t \mathbb{E}[\varphi(s)]\, ds = O(1).$$

It is natural to require that the integrand be zero pointwise.

Definition 20 (Unbiased stationary state) We say that a clock has an unbiased stationary state $\varphi(t)$ if $\mathbb{E}[\varphi(t)] = 0$.

Whether a given clock has an unbiased stationary state is not a robust 
statement. It is sensitive to the noise and to the choice of estimation 
strategy. A natural question is under which conditions on $K_t$ and $\rho(\varphi)$ we can find an estimation strategy $\{\Pi, \Phi\}$ such that the clock has an unbiased stationary state. We do not know any general answer to that question and rather choose to assume more about the clock.

We require two additional properties. First, that the frequency error remains unbiased provided the initial error $\varphi_0$ is unbiased, i.e. $\mathbb{E}[\varphi_0] = 0$. In Lemma 9 we proved that this can happen only if the estimation strategy is $\zeta$-biased for some $\zeta \in \mathbb{R}$. We further require that the state is asymptotically unbiased for any initial error $\varphi_0$; this restricts the parameter to $|\zeta| < 1$ (see Eq. (24)).

Definition 21 (Unbiased clock) We say that a clock is unbiased if $K_t\varphi$ is a martingale and the estimation strategy $\{\Pi, \Phi\}$ is $\zeta$-biased (with respect to the family $\rho(\tau\varphi)$) with $|\zeta| < 1$. If a value of $\zeta$ is given we say that a clock is $\zeta$-unbiased.

We recall that an estimator $\Phi = \Phi_\zeta$ is $\zeta$-biased if it has the form $\Phi_\zeta = (1-\zeta)\Phi_0$, where $\Phi_0$ is an unbiased estimator and $\zeta$ is a real number. In particular it holds

$$\mathbb{E}[\varphi - \hat\varphi_\zeta] = \zeta\, \mathbb{E}[\varphi]. \qquad (24)$$

This implies that an unbiased clock satisfies $\mathbb{E}[\varphi(t)] = 0$ provided the initial state is unbiased, $\mathbb{E}[\varphi_0] = 0$. Moreover any stationary state of an unbiased clock is unbiased.

We aim at developing a clock model at an appropriate level of generality; however in the sequel we pick a particular noise $K_t$.

Definition 22 (Gaussian clock) We say that a clock is Gaussian with a diffusion constant $D$ if

$$K_t\varphi = \varphi + B_t,$$

where $B_t$ is a Brownian motion with zero mean and the diffusion constant $D$.

It pays to examine a clock without noise (a Gaussian clock with $D = 0$). We do that in the following section. Afterwards we study properties of a stationary state of an unbiased Gaussian clock.



5.1 A clock without noise 

It is rather surprising that an important feature of clock operation can be demonstrated in the trivial case $K_t = 1$. The stochastic process $\varphi(t)$ simplifies significantly. The frequency error $\varphi(t)$ is constant in the intervals $(nT, (n+1)T)$ and jumps on their boundaries. Its value inside the interval was denoted by $\varphi_n$ and the jump at the right side of the interval is determined by the estimation $\hat\varphi_n$. Eq. (23) takes the form

$$\varphi_{n+1} = \varphi_n - \hat\varphi_n.$$

The time scales $T$, $\tau$ have a somewhat artificial role in this case. The time scale $\tau$ appears only through the scaling of the family of states, $\rho(\tau\varphi)$, upon which the estimation is based. This leads to a factor $\tau^2$ in front of the Fisher information, cf. Eq. (19). However this is also an essential factor to recover the correct physical units.

The clock time associated to a given state $\varphi(t)$ is given by

$$t_{\mathrm{clock}} - t = \frac{1}{\omega_0} \int_0^t \varphi(s)\, ds = \frac{T}{\omega_0} \sum_{n=0}^{t/T - 1} \varphi_n. \qquad (25)$$

We claim that the variance of the clock time has a universal bound, although the variance of the frequency error can be arbitrarily small.

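Theorem 23 (Unbiased clock without noise) Suppose that $K_t = 1$ and that $C_\zeta = \{\rho(\varphi), \Pi(a), \Phi_\zeta\}$ is a family of unbiased atomic clocks; $\Phi_\zeta = (1-\zeta)\Phi_0$ being a $\zeta$-biased estimator. Let $\varphi_\zeta(t)$ be a stationary state of the clock $C_\zeta$, then

$$\mathbb{E}[\varphi_\zeta(nT)^2] \ge \frac{1-\zeta}{1+\zeta}\, \mathbb{E}\left[\frac{1}{\tau^2 F(\tau\varphi_\zeta(nT))}\right]. \qquad (26)$$

The variance of clock time associated to $\varphi_\zeta(t)$ is $\zeta$ independent and satisfies a bound,

$$\mathbb{E}[(t_{\mathrm{clock}} - t)^2] \ge \frac{tT}{\omega_0^2}\, \mathbb{E}\left[\frac{1}{\tau^2 F(\tau\varphi_\zeta(nT))}\right] + O(1), \qquad (27)$$

where $F(\varphi)$ is the Fisher information associated to the family $\rho(\varphi)$.

Proof: Fix $\zeta$ and denote $\varphi_n := \varphi_\zeta(nT)$, $n \in \mathbb{N}$. Then $\varphi_n$ is a stationary process with zero mean and variance $\sigma^2 := \mathbb{E}[\varphi_n^2]$. The Cramér-Rao inequality, Eq. (13), then implies

$$\sigma^2 = \mathbb{E}[(\varphi_n - \hat\varphi_n)^2] \ge (1-\zeta)^2\, \mathbb{E}\left[\frac{1}{\tau^2 F(\tau\varphi_n)}\right] + \zeta^2\sigma^2.$$

The inequality (26) follows by solving for $\sigma^2$.

We claim that $\varphi_n$ is a Markov chain with exponentially decaying correlations

$$\mathbb{E}[\varphi_{n+h}\varphi_n] = \zeta^h\sigma^2.$$

Then according to Example 6 the variance of the clock time is

$$\mathbb{E}[(t_{\mathrm{clock}} - t)^2] = \frac{tT}{\omega_0^2}\, \sigma^2\, \frac{1+\zeta}{1-\zeta} + O(1).$$

Plugging in inequality (26) one obtains the bound (27).

Exponential decay of correlations follows from the unbiasedness condition, which by Lemma [?] implies that for $h \ge 1$, $\mathbb{E}[\varphi_{n+h}\varphi_n] = \zeta\, \mathbb{E}[\varphi_{n+h-1}\varphi_n]$. □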
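
The $\zeta$ independence of the clock time can be illustrated with a short deterministic Python sketch. It assumes a measurement that saturates the Cramér-Rao bound, $\hat\varphi_n = (1-\zeta)(\varphi_n + \varepsilon_n)$ with independent zero-mean errors $\varepsilon_n$ of variance `v` (standing in for $\mathbb{E}[1/(\tau^2 F)]$), so that $\varphi_{n+1} = \zeta\varphi_n - (1-\zeta)\varepsilon_n$. The function names and the parameter `v` are hypothetical and chosen only for this illustration.

```python
def stationary_variance(zeta, v, iterations=2000):
    """Fixed point of sigma^2 -> zeta^2 sigma^2 + (1 - zeta)^2 v, found by iteration."""
    sigma2 = 0.0
    for _ in range(iterations):
        sigma2 = zeta ** 2 * sigma2 + (1.0 - zeta) ** 2 * v
    return sigma2


def clock_variance_rate(zeta, v, max_lag=2000):
    """Sum over all lags of E[phi_{n+h} phi_n] = sigma^2 (1 + 2 sum_{h>=1} zeta^h).

    Multiplied by t T / omega_0^2 this is the leading term of the
    clock time variance.
    """
    sigma2 = stationary_variance(zeta, v)
    return sigma2 * (1.0 + 2.0 * sum(zeta ** h for h in range(1, max_lag)))


if __name__ == "__main__":
    for zeta in (-0.5, 0.0, 0.5, 0.9):
        print(zeta, stationary_variance(zeta, 1.0), clock_variance_rate(zeta, 1.0))
```

Under these assumptions the stationary variance equals $v(1-\zeta)/(1+\zeta)$ and changes with $\zeta$, while the rate that enters the clock time variance stays equal to $v$.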

In Section 6 we will see an example where all bounds in the theorem are achieved. The moral to be taken is that there is a 1-parameter family of clocks whose stationary states differ in autocorrelations, however giving an equally good clock time.

We believe that the bound (27) should be valid without assuming that the clock is unbiased. Instead only a certain ergodicity assumption to prevent a trivial counterexample of no synchronization† should be made. We were not able to find a proof of any such statement.

To be more concrete, let $\varphi_n$ be an unbiased stationary state and denote $\zeta := \mathbb{E}[\varphi_{n+1}\varphi_n]/\mathbb{E}[\varphi_n^2]$. Then the inequality (26) holds true in view of Eq. (20), but we cannot anymore conclude (27) without additional assumptions. Eq. (27) holds if we (for example) assume a detailed balance of $\varphi_n$, however we do not see any physical motivation to do so.

5.2 An unbiased clock 

In this section we show that for an unbiased atomic clock we can explicitly relate all autocorrelations of $\varphi_{n+h}(t)$ and $\varphi_n(t')$ to quadratic quantities related to $\varphi_n(t')$ alone. In particular for a stationary state of the clock we can compute all autocorrelations provided we know the autocorrelation functions of $K_t\varphi$.

Proposition 24 Suppose that $\varphi(t)$ is a state of a $\zeta$-unbiased atomic clock. Then for any integer $h \ge 1$ and $t, t' \in [0, T)$ its autocorrelations are given by the formula

$$\mathbb{E}[\varphi_{n+h}(t)\varphi_n(t')] = \zeta^{h-1}\, \mathbb{E}[(\varphi_n(T) - (1-\zeta)\bar\varphi_n)\varphi_n(t')]. \qquad (28)$$

We recall that $\varphi_n(T) := \lim_{t \to T} \varphi_n(t)$ and that $\bar\varphi_n$ is the time average defined in Eq. (21).



Proof: Since $\varphi_n(t)$, $t \in [0, T]$, is a martingale we have by Lemma 3

$$\mathbb{E}[\varphi_{n+h}(t)\varphi_n(t')] = \mathbb{E}[\varphi_{n+h}(0)\varphi_n(t')]. \qquad (29)$$

Now recall that the definition of $\zeta$-biased estimation and Lemma 3 give

$$\mathbb{E}[\hat\varphi_{n+h}\varphi_n(t')] = (1-\zeta)\, \mathbb{E}[\bar\varphi_{n+h}\varphi_n(t')].$$

The latter relation then implies

$$\mathbb{E}[\varphi_{n+h}(0)\varphi_n(t')] = \mathbb{E}[(\varphi_{n+h-1}(T) - \hat\varphi_{n+h-1})\varphi_n(t')]$$
$$= \mathbb{E}[(\varphi_{n+h-1}(T) - \bar\varphi_{n+h-1} + \zeta\bar\varphi_{n+h-1})\varphi_n(t')]. \qquad (30)$$

† A process $\varphi_{n+1} = \varphi_n$ with initial condition $\varphi_0 = 0$.

For $h \ge 2$ this further simplifies. Note that in view of Eq. (29) we have that for such $h$, $\mathbb{E}[(\varphi_{n+h-1}(T) - \bar\varphi_{n+h-1})\varphi_n(t')] = 0$. Hence it holds

$$\mathbb{E}[\varphi_{n+h}(0)\varphi_n(t')] = \zeta\, \mathbb{E}[\bar\varphi_{n+h-1}\varphi_n(t')].$$

The last equation can be used recursively until we obtain only autocorrelations between $\varphi_{n+1}(\cdot)$ and $\varphi_n(\cdot)$. Such autocorrelations can then be expressed by Eq. (30), leading to Eq. (28). □

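The proposition also allows us to compute autocorrelations of the integrated error

$$\theta_n = \int_0^T \varphi_n(s)\, ds,$$

where $\varphi(s)$ is a state of a $\zeta$-unbiased atomic clock. Without computation details we give the formula for future reference,

$$\mathbb{E}[\theta_{n+h}\theta_n] = \zeta^h T \int_0^{T-\tau} \mathbb{E}[\varphi_n(s)^2]\, ds + \zeta^{h-1} T \int_{T-\tau}^{T} \mathbb{E}[\varphi_n(s)^2]\, ds - (1-\zeta)\zeta^{h-1} T\tau\, \mathbb{E}[\bar\varphi_n^2], \quad (h \ge 1). \qquad (31)$$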



5.3 Gaussian clock 

Here we discuss a partially solvable model in which the evolution in between 
measurements is that of a Gaussian random walk. Rather than a fundamental 
model for frequency noise of a classical oscillator this should be considered 
as a generic short time model with one free parameter, the diffusion constant 
D. For long times this model is indeed unrealistic since frequency should be 
described by a stationary process. 

Here is a detailed description of the noise. Within the interval $(nT, (n+1)T)$ the frequency noise $\varphi(t)$ behaves like a random walk with diffusion constant $D$ and initial condition $\varphi_n$. This means that $\varphi_n(t) = \varphi_n + B_t$, where $B_t$ is a Brownian motion of zero mean and diffusion constant $D$, see Eq. (4). Brownian motions in different intervals are independent†.

Since we are describing the diffusion of frequency the dimension of $D$ is that of time$^{-3}$.

† In fact the Brownian motion should come with an index $n$ which we omit for simplicity of notation.

We derive asymptotic bounds for a stationary state of a Gaussian clock. As before there is a 1-parameter family of clocks parametrized by the unbiasedness $\zeta$.

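Theorem 25 (Gaussian Clock) Suppose that $C_\zeta = \{\rho(\varphi), \Pi(a), \Phi_\zeta\}$ is an unbiased Gaussian atomic clock with a diffusion constant $D$; $\Phi_\zeta = (1-\zeta)\Phi_0$ being a $\zeta$-biased estimator. Let $\varphi(t)$ be a stationary state of the clock $C_\zeta$, then

$$\mathbb{E}[\varphi_n^2] \ge \frac{1-\zeta}{1+\zeta}\, \frac{1}{F\tau^2} + \frac{2DT}{1-\zeta^2}\, g\left(\zeta, \frac{\tau}{T}\right), \qquad (32)$$
$$g(\zeta, x) = \zeta^2 + \tfrac{1}{3}(1-\zeta)(1+2\zeta)x.$$

Moreover the associated clock time variance satisfies

$$\mathbb{E}[(t_{\mathrm{clock}} - t)^2] \ge \frac{t}{\omega_0^2} \left( \frac{T}{F\tau^2} + \frac{2DT^2}{3}\, \frac{f(\zeta, \tau/T)}{(1-\zeta)^2} \right) + O(1), \qquad (33)$$

where

$$f(\zeta, x) = 1 + \zeta + \zeta^2 + (1+2\zeta)(1-\zeta)x + (1-\zeta)^2x^2.$$

Above $1/F$ is a shorthand for $\mathbb{E}[1/F(\tau\bar\varphi_n)]$.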



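Proof: We follow the proof of Theorem 23 (the case $D = 0$), only the details are more involved. We denote by $\sigma^2 = \mathbb{E}[\varphi_n^2]$ the variance of the stationary process $\varphi_n$.

Throughout the proof we perform calculations of various autocorrelations related to the process $\varphi_n(t)$. We do not provide details of such calculations. Here are the first three formulas we shall use:

$$\mathbb{E}[(\varphi_n(T) - \bar\varphi_n)\bar\varphi_n] = \tfrac{1}{3}D\tau,$$
$$\mathbb{E}[(\varphi_n(T) - \bar\varphi_n)^2] = \tfrac{2}{3}D\tau,$$
$$\mathbb{E}[\bar\varphi_n^2] = \sigma^2 + 2DT - \tfrac{4}{3}D\tau. \qquad (34)$$

Now we derive the bound for the variance of the stationary state,

$$\sigma^2 = \mathbb{E}[\varphi_{n+1}^2] = \mathbb{E}[((\varphi_n(T) - \bar\varphi_n) + (\bar\varphi_n - \hat\varphi_n))^2]$$
$$= \mathbb{E}[(\varphi_n(T) - \bar\varphi_n)^2] + 2\zeta\, \mathbb{E}[(\varphi_n(T) - \bar\varphi_n)\bar\varphi_n] + \mathbb{E}[(\bar\varphi_n - \hat\varphi_n)^2]$$
$$= \tfrac{2}{3}D\tau + \tfrac{2}{3}D\tau\zeta + \mathbb{E}[(\bar\varphi_n - \hat\varphi_n)^2].$$

Using inequality (13) on the last term above we then have

$$\sigma^2 \ge \tfrac{2}{3}D\tau(1+\zeta) + (1-\zeta)^2\, \frac{1}{F\tau^2} + \zeta^2\left(\sigma^2 + 2DT - \tfrac{4}{3}D\tau\right). \qquad (35)$$

Eq. (32) follows.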

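The clock time can be expressed by the stationary stochastic process $\theta_n$ as (we pretend that $t/T$ is an integer, the difference results in the term $O(1)$ appearing in Eq. (33))

$$t_{\mathrm{clock}} - t = \frac{1}{\omega_0} \sum_{n=0}^{t/T - 1} \theta_n.$$

The variance of the clock time is then expressed through autocorrelations of the process $\theta_n$ by the formula (3). Autocorrelations for a general $\zeta$-unbiased clock are given in Eq. (31). For a Gaussian clock we can compute all the involved autocorrelations and integrals. This gives

$$\mathbb{E}[\theta_n^2] = T^2\sigma^2 + \tfrac{2}{3}DT^3,$$
$$\mathbb{E}[\theta_{n+h}\theta_n] = \zeta^h\left(T^2\sigma^2 + DT^3\right) + \tfrac{1}{3}(1-\zeta)\zeta^{h-1}DT\tau^2, \quad h \ge 1.$$

For the variance of the clock time this implies

$$\mathbb{E}[(t_{\mathrm{clock}} - t)^2] = \frac{t}{\omega_0^2} \left( \frac{1+\zeta}{1-\zeta}\, T\sigma^2 + \frac{2D}{3}\, \frac{(1+2\zeta)T^2 + (1-\zeta)\tau^2}{1-\zeta} \right) + O(1).$$

The bound (33) in the theorem is then achieved by plugging in Eq. (32) and doing straightforward algebraic manipulations. □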
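
The final algebraic step can be checked numerically. The Python sketch below assumes the forms of $g$, $f$ and of the right hand sides of Eqs. (32) and (33) spelled out in the docstrings, treats $1/F$ as a constant, and drops the overall factor $t/\omega_0^2$ and the $O(1)$ term. All function names and sample parameter values are hypothetical.

```python
def g(zeta, x):
    """g(zeta, x) = zeta^2 + (1 - zeta)(1 + 2 zeta) x / 3."""
    return zeta ** 2 + (1.0 - zeta) * (1.0 + 2.0 * zeta) * x / 3.0


def f(zeta, x):
    """f(zeta, x) = 1 + zeta + zeta^2 + (1 + 2 zeta)(1 - zeta) x + (1 - zeta)^2 x^2."""
    return (1.0 + zeta + zeta ** 2
            + (1.0 + 2.0 * zeta) * (1.0 - zeta) * x
            + (1.0 - zeta) ** 2 * x ** 2)


def variance_bound(D, T, tau, F, zeta):
    """Assumed right hand side of Eq. (32) for the stationary variance."""
    return ((1.0 - zeta) / (1.0 + zeta) / (F * tau ** 2)
            + 2.0 * D * T / (1.0 - zeta ** 2) * g(zeta, tau / T))


def clock_rate_from_variance(sigma2, D, T, tau, zeta):
    """omega_0^2 Var(t_clock - t) / t written in terms of sigma^2 (leading order)."""
    return ((1.0 + zeta) / (1.0 - zeta) * T * sigma2
            + 2.0 * D / 3.0
            * ((1.0 + 2.0 * zeta) * T ** 2 + (1.0 - zeta) * tau ** 2)
            / (1.0 - zeta))


def clock_rate_bound(D, T, tau, F, zeta):
    """Assumed right hand side of Eq. (33) times omega_0^2 / t, without O(1)."""
    return (T / (F * tau ** 2)
            + 2.0 * D * T ** 2 / 3.0 * f(zeta, tau / T) / (1.0 - zeta) ** 2)


if __name__ == "__main__":
    params = dict(D=0.5, T=1.0, tau=0.5, F=4.0, zeta=0.3)  # hypothetical values
    sigma2 = variance_bound(**params)
    print(clock_rate_from_variance(sigma2, 0.5, 1.0, 0.5, 0.3),
          clock_rate_bound(**params))
```

Inserting the variance bound into the clock time variance reproduces the bound with $f$ for any admissible choice of parameters; note also that $f(\zeta, 1) = 3$ for every $\zeta$, which simplifies the case $T = \tau$.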

We postpone an analysis of the bounds, in particular their dependence on $\zeta$, $\tau$ and $T$, to Section 7 and proceed to give an example where all the bounds are achieved.



6 Gaussian families 

Here we aim to illustrate our concepts on a simple solvable example. 
Let $\rho(\varphi) = |\psi(\varphi)\rangle\langle\psi(\varphi)|$ be a family of Gauss states on a real line,

$$\psi(\varphi)(x) = \left(\frac{F}{2\pi}\right)^{1/4} \exp\left(-\frac{F}{4}(x - \varphi)^2\right).$$

Our notation highlights the Fisher information. It is a matter of simple computation to find that the Fisher information, $F(\varphi)$, of $\rho(\varphi)$ is indeed constant and equal to $F$.

We consider an estimation strategy $\{\Pi(a), \Phi_\zeta\}$, where $\Pi(a)$ is an orthogonal decomposition of the position operator, $X$, on a line,

$$X = \int a\, \Pi(a)\, da, \qquad \Pi(a) = \delta(x - a),$$

and $\Phi_\zeta(a) = (1-\zeta)a/\tau$. The factor $1/\tau$ in the estimator makes the estimation strategy $\zeta$-biased with respect to the family $\rho(\tau\varphi)$, see Remark [?]. The conditional probability distribution of an estimate $\hat\varphi$ given the parameter $\varphi$ is then, cf. Eq. (18),

$$p(\hat\varphi|\varphi) = \sqrt{\frac{F\tau^2}{2\pi(1-\zeta)^2}}\, \exp\left(-\frac{F\tau^2}{2(1-\zeta)^2}\left(\hat\varphi - (1-\zeta)\varphi\right)^2\right).$$

We see that $p(\hat\varphi|\varphi)$ is a Gaussian kernel. For $\zeta = 0$ it is a symmetric heat kernel and hence unbiased. In general the estimator is a multiple of an unbiased estimator and the estimation strategy $\{\Pi, \Phi_\zeta\}$ is $\zeta$-biased. This can also be checked by direct integration of $\hat\varphi$ with respect to the kernel.

Now we consider a Gaussian clock $\{\rho(\varphi), \Pi, \Phi_\zeta\}$ and its state $\varphi(t)$. We recall that being Gaussian means that $\varphi_n(t) = \varphi_n + B_t$, where $B_t$ is a Brownian motion with a diffusion constant $D$. We claim that for such a clock the Markov chain $\varphi_n$ is Gaussian. We show this directly by computing the transition map $T(\varphi, \varphi') = p(\varphi_{n+1} = \varphi \,|\, \varphi_n = \varphi')$.

This map can be computed by considering the joint probability distribution $p(\bar\varphi_n, \varphi_n(T) \,|\, \varphi_n)$. It is a bivariate Gaussian distribution with mean $\mu = (\varphi_n, \varphi_n)$ and a covariance matrix independent of $\varphi_n$, whose elements can be computed in a standard way, cf. Eq. (34). The transition map is then given by (recall that $\varphi_{n+1} = \varphi_n(T) - \hat\varphi_n$)

$$T(\varphi, \varphi') = \int p(\hat\varphi_n = x - \varphi,\ \varphi_n(T) = x \,|\, \varphi_n = \varphi')\, dx$$
$$= \int\!\!\int p(\hat\varphi_n = x - \varphi \,|\, \bar\varphi_n = y)\, p(\bar\varphi_n = y,\ \varphi_n(T) = x \,|\, \varphi_n = \varphi')\, dx\, dy.$$

An integral of Gaussian kernels is itself a Gaussian, proving the claim that $\varphi_n$ is a Gaussian process.

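The transition map can also be computed explicitly. One can either compute the involved Gaussian integrals or read the outcome from the computation in the proof of Theorem 25 (RHS of Eq. (35)). Either way one arrives at

$$T(\varphi, \varphi') = \frac{1}{\sqrt{2\pi s^2}} \exp\left(-\frac{1}{2s^2}(\varphi - \zeta\varphi')^2\right), \qquad (36)$$
$$s^2 = \frac{(1-\zeta)^2}{F\tau^2} + \zeta^2\, 2DT + \tfrac{2}{3}D\tau(1 + \zeta - 2\zeta^2).$$

Gaussian states are determined by their mean, $\mu$, and variance, $\sigma^2$. If we represent them by a column vector $(\mu, \sigma^2)^\top$ then $T$ is an affine operation

$$T \begin{pmatrix} \mu \\ \sigma^2 \end{pmatrix} = \begin{pmatrix} \zeta\mu \\ \zeta^2\sigma^2 + s^2 \end{pmatrix}.$$

It is then easy to determine a stationary Gaussian distribution: it has zero mean and a variance satisfying the equation $\sigma^2 = \zeta^2\sigma^2 + s^2$. This gives

$$\sigma^2 = \frac{s^2}{1-\zeta^2} = \frac{1-\zeta}{1+\zeta}\, \frac{1}{F\tau^2} + \frac{2DT}{1-\zeta^2}\left(\zeta^2 + \tfrac{1}{3}(1-\zeta)(1+2\zeta)\frac{\tau}{T}\right).$$

This is exactly the RHS of the bound (32). Since the stationary state saturates this bound, it also saturates the bound for the clock time. We summarize: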

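Theorem 26 Let $C_\zeta = (\rho(\varphi), \Pi, \Phi_\zeta)$ be the Gaussian clock described above. Then $C_\zeta$ possesses a Gaussian stationary state $\varphi_\zeta(t)$ with variance given by

$$\mathbb{E}[\varphi_n^2] = \frac{1-\zeta}{1+\zeta}\, \frac{1}{F\tau^2} + \frac{2DT}{1-\zeta^2}\, g\left(\zeta, \frac{\tau}{T}\right), \qquad g(\zeta, x) = \zeta^2 + \tfrac{1}{3}(1-\zeta)(1+2\zeta)x. \qquad (37)$$

The associated clock time has a standard diffusive behavior

$$\mathbb{E}[(t_{\mathrm{clock}} - t)^2] = 2D_{\mathrm{clock}}\, t + O(1)$$

with a diffusion constant

$$2D_{\mathrm{clock}} = \frac{1}{\omega_0^2} \left( \frac{T}{F\tau^2} + \frac{2DT^2}{3}\, \frac{f(\zeta, \tau/T)}{(1-\zeta)^2} \right), \qquad (38)$$

where $F$ is the Fisher information of the family $\rho(\varphi)$ and

$$f(\zeta, x) = 1 + \zeta + \zeta^2 + (1+2\zeta)(1-\zeta)x + (1-\zeta)^2x^2.$$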

Remark 27 There is a reason for the saturation of the bounds: in the Gaussian case the Cramér-Rao bound, Eq. (13), is saturated, because the condition for equality in the Cauchy-Schwarz inequality in Eq. (12) is met. In fact this proves Theorem 26 without any computation; however we believe that the explicit computations that were presented in this section complement the rather abstract approach of the previous sections.
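
The stationary variance of the Gaussian clock can also be estimated by direct simulation. The following Python sketch is a Monte Carlo illustration under stated assumptions: the Brownian motion satisfies $\mathbb{E}[B_t^2] = 2Dt$, the measurement returns $\hat\varphi_n = (1-\zeta)(\bar\varphi_n + \varepsilon_n)$ with Gaussian $\varepsilon_n$ of variance $1/(F\tau^2)$, and the prediction uses $s^2 = (1-\zeta)^2/(F\tau^2) + 2\zeta^2 DT + \tfrac{2}{3}D\tau(1+\zeta-2\zeta^2)$. Function names, the seed and the sample parameters are hypothetical.

```python
import math
import random


def simulate_stationary_variance(D, T, tau, F, zeta, periods=200_000, seed=0):
    """Estimate E[phi_n^2] in the stationary state by iterating Eqs. (22), (23)."""
    rng = random.Random(seed)
    phi = 0.0                       # frequency error right after a synchronization
    burn_in = periods // 10         # discard the transient before averaging
    total, count = 0.0, 0
    for n in range(periods):
        # Brownian increment before the interrogation window starts.
        before = rng.gauss(0.0, math.sqrt(2.0 * D * (T - tau)))
        # Increment over the window and its time average are jointly Gaussian:
        # Var(w_end) = 2 D tau, Cov(w_end, w_avg) = D tau, Var(w_avg) = 2 D tau / 3.
        w_end = rng.gauss(0.0, math.sqrt(2.0 * D * tau))
        w_avg = 0.5 * w_end + rng.gauss(0.0, math.sqrt(D * tau / 6.0))
        phi_bar = phi + before + w_avg     # time average over the window, Eq. (21)
        phi_left = phi + before + w_end    # left limit phi_n(T)
        # zeta-biased estimate with measurement noise of variance 1 / (F tau^2).
        noise = rng.gauss(0.0, 1.0 / (tau * math.sqrt(F)))
        phi_hat = (1.0 - zeta) * (phi_bar + noise)
        phi = phi_left - phi_hat           # synchronization jump, Eq. (23)
        if n >= burn_in:
            total += phi * phi
            count += 1
    return total / count


def predicted_variance(D, T, tau, F, zeta):
    """Fixed point of sigma^2 = zeta^2 sigma^2 + s^2 with the assumed s^2."""
    s2 = ((1.0 - zeta) ** 2 / (F * tau ** 2)
          + zeta ** 2 * 2.0 * D * T
          + 2.0 / 3.0 * D * tau * (1.0 + zeta - 2.0 * zeta ** 2))
    return s2 / (1.0 - zeta ** 2)


if __name__ == "__main__":
    args = (0.5, 1.0, 0.5, 4.0, 0.3)    # D, T, tau, F, zeta (hypothetical values)
    print(simulate_stationary_variance(*args), predicted_variance(*args))
```

The two printed numbers should agree up to statistical fluctuations, which shrink as `periods` grows.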



7 Optimization of an atomic clock operation 

A practical reason to study atomic clocks is the optimization of their performance. We have seen in the last section that the bounds (32), (33) are attainable and now we proceed to optimize the error of clock time and the variance of frequency as obtained in the last example. Alternatively this can be viewed as optimizing the bounds. It is clear that increasing the frequency $\omega_0$ or the Fisher information $F$ (respectively decreasing the dissipation $D$) would improve the performance. Therefore we fix these quantities and optimize the expressions with respect to $\tau$, $T$ and $\zeta$. For simplicity of exposition we discuss only the case $T = \tau$.

We are going to minimize the diffusion of clock time (multiplied by $\omega_0^2$), where $D_{\mathrm{clock}}$ denotes the diffusion constant of the clock time from Eq. (38),

$$2\omega_0^2 D_{\mathrm{clock}} = \frac{1}{FT} + \frac{2DT^2}{(1-\zeta)^2}, \qquad (39)$$

and the frequency variance of a stationary state of the clock after a synchronization

$$\mathbb{E}[\varphi_n^2] = \frac{1-\zeta}{1+\zeta}\, \frac{1}{FT^2} + \frac{2DT}{3}\, \frac{1+\zeta+\zeta^2}{1-\zeta^2}. \qquad (40)$$

Perhaps a word on why to optimize the latter, the variance of frequency, is needed. In the bounds (32), (33) there appears an average Fisher information over a stationary state of the clock. Suppose momentarily that the Fisher information is not constant. It is then reasonable to believe that a realistic $F(\varphi)$ has a maximum at $\varphi = 0$ and decreases with increasing $|\varphi|$. Therefore an operation with a small variance of the frequency is preferable. Furthermore the variance of frequency should be related to the entropy production of the clock: the smaller the variance, the smaller the entropy production.






We first optimize Eq. (39) with respect to the interrogation time $T$ and then we obtain the optimal $\zeta$ from Eq. (40). This is clearly an ad-hoc choice, however there is no pair $(\zeta, T)$ that would jointly minimize both equations. For fixed $\zeta$ the optimal time $T$ satisfies

$$4DFT^3 = (1-\zeta)^2. \qquad (41)$$

For this optimal time $T$ minimizing Eq. (40) leads to a third order equation that can be solved numerically and gives the optimum $\zeta \approx 0.35$. At this optimum the diffusion of clock time and the variance of frequency are given by



^ 16 



2 \ 1/3 



Note that Eq. (41) justifies the intuition that the optimal interrogation time is determined by a balance between the dissipation $DT$ and the gained information $1/(FT^2)$.
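
The two-step optimization can be reproduced numerically. The Python sketch below assumes $T = \tau$, the clock diffusion $1/(FT) + 2DT^2/(1-\zeta)^2$ for Eq. (39), and the stationary frequency variance $\frac{1-\zeta}{1+\zeta}\frac{1}{FT^2} + \frac{2DT}{3}\frac{1+\zeta+\zeta^2}{1-\zeta^2}$ of the Gaussian example for Eq. (40). The function names, the grid search and the sample values of $D$ and $F$ are illustrative choices, not part of the original analysis.

```python
def clock_diffusion(T, D, F, zeta):
    """Assumed form of Eq. (39): 2 omega_0^2 D_clock for T = tau."""
    return 1.0 / (F * T) + 2.0 * D * T ** 2 / (1.0 - zeta) ** 2


def optimal_time(D, F, zeta):
    """Solution of Eq. (41): 4 D F T^3 = (1 - zeta)^2."""
    return ((1.0 - zeta) ** 2 / (4.0 * D * F)) ** (1.0 / 3.0)


def frequency_variance(T, D, F, zeta):
    """Assumed form of Eq. (40): stationary variance after a synchronization."""
    return ((1.0 - zeta) / (1.0 + zeta) / (F * T ** 2)
            + 2.0 * D * T * (1.0 + zeta + zeta ** 2) / (3.0 * (1.0 - zeta ** 2)))


def optimal_zeta(D, F, steps=19801):
    """Grid search over zeta in [-0.99, 0.99], using the optimal T for each zeta."""
    best_zeta, best_value = None, float("inf")
    for k in range(steps):
        zeta = -0.99 + 1.98 * k / (steps - 1)
        value = frequency_variance(optimal_time(D, F, zeta), D, F, zeta)
        if value < best_value:
            best_zeta, best_value = zeta, value
    return best_zeta, best_value


if __name__ == "__main__":
    D, F = 1.0, 1.0                     # hypothetical units
    zeta, variance = optimal_zeta(D, F)
    T = optimal_time(D, F, zeta)
    print(zeta, T, variance, clock_diffusion(T, D, F, zeta))
```

Under these assumptions the optimal $\zeta$ does not depend on the values of $D$ and $F$, since both enter the variance at the optimal time only through an overall factor.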



8 Outlooks 

Here we want to outline further problems that are in our opinion both im- 
portant and attractive. 

Towards a realistic atomic clock 

At least two points need to be addressed in order to have a realistic model 
of an atomic clock: 

• Our quantum states are mere caricatures. One should use states of spins, e.g.

$$\rho(\tau\varphi) = e^{-i\tau\varphi H}\, |N \text{ spins}\rangle\langle N \text{ spins}|\, e^{i\tau\varphi H}, \qquad (42)$$

where $H = \sigma_z^{(1)} + \sigma_z^{(2)} + \cdots + \sigma_z^{(N)}$. However there is a hidden catch: such families are always $2\pi/\tau$ periodic in $\varphi$ and there cannot be any stationary state of the clock (the stationary probability distribution of such a state would have the same periodicity and hence it would not be normalizable). A coarse graining procedure or a scaling limit needs to be introduced.

• One should go beyond Hamiltonian dynamics. Although our results do 
not require p{(p) to be a unitary family, they are phrased in terms of the 
Fisher information and they offer no direct understanding on how the 
decoherence timescales of the quantum reference influence operation of 
the atomic clock. This extension should be more or less straightforward 
as a good understanding of the Fisher information for mixed states 
and the quantum estimation in presence of a noise is available in the 
literature, see [IH], [H] and references therein. 

Existence of a stationary state and a large system limit 

We left unstudied the question under which conditions there exists a station- 
ary state of the clock. In particular this should be true in a large system 
limit and a study of the limit itself is even more important. Suppose that 
the family of states is given by independent copies of the same state, 

and let $F(\varphi)$ be the Fisher information of $\rho(\varphi)$. It is known that in the limit of large $N$ this state can be represented in the vicinity of $\varphi = 0$ by a Gaussian state with the Fisher information $F(0)$, see [?], [?]. This is an instance of the quantum central limit theorem. In particular, with respect to the estimation theory, the example in Section 6 is generic in the large $N$ limit. The question is whether this allows us to draw a similar conclusion for the stationary state of the clock.

Entropy production 

An atomic clock perpetually measures the quantum reference standard in or- 
der to synchronize the clock. This measurement causes entropy production 
on the quantum system, which can be related to a heat by the Landauer 
principle. To compute this entropy production during the stationary opera- 
tion was the original motivation of the author when approaching the subject 
of atomic clocks. However the subject turned out to be much wider and this 
particular question had to be postponed.